System and method for analyzing the structure of logical networks

ABSTRACT

Systems and methods for analyzing the structure of logical networks. Embodiments of the invention include ranking critical nodes according to regional hierarchies, distance hierarchies, global hierarchies, and relay hierarchies. Embodiments of the present invention are capable of testing the effectiveness of such hierarchies. In addition, critical nodes may be used to define critical regions.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 10/902,416, filed Jul. 30, 2004, which claims priority of U.S. Provisional Application No. 60/490,910, filed Jul. 30, 2003. These applications are incorporated by reference in their entireties.

FEDERALLY SPONSORED DEVELOPMENT

This invention was made with U.S. Government support under grant number 60NANB2D0108, awarded by the National Institute of Standards and Technology (NIST). The U.S. Government may have certain rights in this invention.

FIELD OF THE INVENTION

The invention relates to systems and methods for analyzing the structure of logical networks.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 illustrates a system, according to one embodiment of the present invention.

FIG. 2 illustrates the method of a regional hierarchy, according to one embodiment of the invention.

FIGS. 3-5 illustrate an example of the method of a regional hierarchy, according to one embodiment of the invention.

FIG. 6 illustrates the method of a distance hierarchy, according to one embodiment of the invention.

FIGS. 7-8 illustrate the method of a distance hierarchy, according to one embodiment of the invention.

FIG. 9 illustrates the method of the global hierarchy, according to one embodiment of the invention.

FIG. 10 illustrates the method of the relay hierarchy, according to one embodiment of the invention.

FIG. 11 illustrates the method of testing the effectiveness of the node criticality ranking hierarchies, according to one embodiment of the invention.

FIGS. 12-13 illustrate an example of the method of testing the effectiveness of the node criticality ranking hierarchies, according to one embodiment of the invention.

FIG. 14 illustrates the method of defining regions by node connectivity, according to one embodiment of the invention.

FIG. 15 illustrates an example of the method of defining regions by node connectivity, according to one embodiment of the invention.

DESCRIPTION OF SEVERAL EMBODIMENTS OF THE INVENTION

Embodiments of the present invention relate to systems and methods for analyzing the structure of logical networks. The embodiments outlined can be used in spatial and non-spatial contexts for a variety of logical network structures.

System

FIG. 1 illustrates a system, according to one embodiment of the present invention. The system includes a storage database 105, which stores the data utilized in the present invention (e.g., network data) and a user interface 175. The network data comprises, for example, but not limited to: satellite imagery data; digitized map data; topological map data; photo data; satellite geo-spatial data; telecommunication data; marketing data; demographic data; business data; North American Industrial Classification (NAIC) code location data; right-of-way routing layers data; metropolitan area fiber geo-spatial data; long haul fiber geo-spatial data; co-location facilities geo-spatial data; internet exchanges geo-spatial data; wireless towers geo-spatial data; wire centers geo-spatial data; undersea cables geo-spatial data; undersea cable landings geo-spatial data; data centers geo-spatial data; static network data; or dynamic network data; or any combination of the above. The right-of-way routing layers data comprises, for example, but not limited to: gas pipeline data; oil pipeline data; highway data; rail data; or electric power transmission lines data; or any combination of the above. The logical network data comprises, for example, but not limited to: static network data; or dynamic network data; or any combination of the above. The static network data comprises, for example, but not limited to: IP network data; or network topology data; or any combination of the above. The dynamic network data comprises, for example, but not limited to, network traffic data. The regional analysis comprises, for example, but not limited to: continent information; nation information; state information; county information; zip code information; census block information; census tract information; time information; metropolitan information; or functional information; or any combination of the above. The functional information comprises, for example, but not limited to: a formula; a federal reserve bank region; a trade zone; a census region; or a monetary region; or any combination of the above.

Data can be obtained by performing, for example, but not limited to: purchasing data; manually constructing data; mining data from external sources; probing networks; tracing networks; accessing proprietary data; or digitizing hard copy data; or any combination of the above.

The system also includes a ranking system 130, which can include: a region program 155, a distance program 165, a global program 161, or a relay program 170, or any combination thereof. The region program 155 is a node criticality ranking approach which defines global connections as links that connect two different regions and local connections as links within a region. The definition of region is fluid, including geographic regions, topological regions, industrial sectors, markets, etc. The distance program 165 is a node criticality ranking approach which defines global connections as links over a certain distance threshold and local connections as links under a certain distance threshold. The definition of distance is fluid, including Euclidean distance, Manhattan distance, latency, bandwidth, flow measurements, etc. The global program 161 is a node criticality ranking approach which looks only at the number of global connections, utilizing either the region program 155 or the distance program 165. The relay program 170 is a node criticality ranking approach which takes the ratio of the total capacity connected to a node (i.e., supply) and the demand for that capacity to identify nodes that are acting as relays between large demand areas.

Regional Hierarchy

In many networks one or more nodes can be identified in a specific region that are most critical to the operation of that region. The region could be geographic, non-geographic, or both. For example, in a geographic region, the most critical nodes for Internet connectivity or airline traffic in a specified geographic area could be identified. As another example, the network (an autonomous system) that is the most critical to the connectivity of financial institutions connected to the Internet could be determined. In addition, the region could be a fusion of both geographic and non-geographic areas where the region is an individual network (autonomous systems) and the interconnection of different networks happens in specific geographic locations. In this case, the most critical interconnection points (i.e., nodes) of several networks could be determined. Embodiments of the invention could be used in a variety of network scenarios, including supply chains, social networks, or any other logical network structure.

FIG. 2 illustrates the method of a regional hierarchy, according to one embodiment of the invention.

In step 205, the network data is loaded into the system as one or more nodes. For example, the sample city-to-city long haul data network illustrated in FIG. 3 could be loaded into the system. Each of the nodes in a network has a location indicated by an identifier. For example, in a geographic region, the location could be tied to a city name. In non-geographic networks, locations can be indicated by other identifiers.

In step 210, each node in the network is assigned to a region based on the node's location. The regions can be defined in a fluid manner, depending on the desires of the user. In the city-to-city long haul data network example, the nodes could be allocated to census regions illustrated in FIG. 4.

In step 215, once each node in the network has been assigned to a region, links (i.e., connections) between nodes are designated as global or local. Links that occur within a region are designated as local links, and links that connect nodes located in different regions are designated as global links. In the city-to-city long haul data network example, a connection between Atlanta, Ga. and Jacksonville, Fla. would be designated as a local link because both nodes are located in the South Atlantic Region.

In step 220, once all links have been designated as global or local links, a ratio of global links to local links is taken for each node in the network, and then weighted by the total number of links to the node. Thus, in the city-to-city long haul data network example, a ratio of one city's (i.e., node's) global links to local links is computed, and then the ratio is weighted by the total connectivity of the network (i.e., the total number of nodes in the network). This would provide an indicator of how well the city acts as a regional connector in the network.

In one embodiment, this process is expressed mathematically as follows: Consider a large network of nodes n, spanning an area A consisting of regions r, with a variable number of nodes inside each region that have a variable number of connections from each region to other regions. For a region r with p number of nodes n, a p×p contiguity matrix represents connections between these nodes. As illustrated in FIG. 5, a contiguity or adjacency matrix M for the entire network of m number of regions r can be constructed as a block diagonal matrix, where matrices along the main diagonal (indicated in the boxes where there is no grid pattern) refer to the contiguity matrices for each of the regions. Interregional connections are represented as the off-block-diagonal elements (indicated in the boxes with a grid pattern).

If a node i in region r is connected to another node j in the same region, then that connection is considered as a local link and is denoted by q_(i(r)j(r)). If node i in region r is connected to node k in region s, then that connection is considered as a global connection and is denoted by g_(i(r)k(s)). Thus, one may associate each node i(r) with a global connectivity index as a ratio between its global and local connections, weighted by the total number of global and local connections for the entire network.

The total number of global connections G is computed from the elements of the upper triangular block of matrix M, of m regions, each with a variable number of nodes:

$\begin{matrix}{G = {{\sum\limits_{i{(1)}}\; {\sum\limits_{s > 1}^{m}\; {\sum\limits_{k{(s)}}\; g_{{i{(1)}}{k{(s)}}}}}} + {\sum\limits_{i{(2)}}\; {\sum\limits_{s > 2}^{m}\; {\sum\limits_{k{(s)}}g_{{i{(2)}}{k{(s)}}}}}} + \ldots + {\sum\limits_{i{({m - 1})}}\; {\sum\limits_{s > {m - 1}}^{m}\; {\sum\limits_{k{(s)}}\; g_{{i{({m - 1})}}{k{(s)}}}}}}}} & (1)\end{matrix}$

Note that, because m is the last region in the block diagonal matrix, its global connections have already been computed in the previous m−1 blocks.

The total number of local connections L is a sum over all the local connections over m regions and is given by:

$\begin{matrix}{L = {{\sum\limits_{i{(1)}}\; {\sum\limits_{{j{(1)}} > {i{(1)}}}\; q_{{i{(1)}}{j{(1)}}}}} + {\sum\limits_{i{(2)}}\; {\sum\limits_{{j{(2)}} > {i{(2)}}}\; q_{{i{(2)}}{j{(2)}}}}} + \ldots + {\sum\limits_{i{(m)}}\; {\sum\limits_{{j{(m)}} > {i{(m)}}}\; q_{{i{(m)}}{j{(m)}}}}}}} & (2)\end{matrix}$

Thus, for example, if Jacksonville, Fla. was located in the Southeast region and had local connections to other cities in the Southeast, including Orlando, Fla., Atlanta, Ga., Tallahassee, Fla., and Charlotte, N.C., but also a connection outside of the Southeast to Washington, D.C. in the Mid-Atlantic region, it would have one global connection (G) and four local connections (L). In a non-spatial context, an example would be identifying a critical autonomous system in the financial sector. The Bank of New York could have local connections to other autonomous systems in the financial region, such as Morgan Stanley and Goldman Sachs, and also have connections to autonomous systems outside of the financial region, such as the Federal Reserve (Govt.), MCI (Telecom), Sprint (Telecom), and General Electric (Tech/Manufacturing). In this case the Bank of New York would have two local connections and four global connections.
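The counting rule behind equations (1) and (2) can be sketched in a few lines. The following is a minimal illustration only, assuming an undirected network stored as a symmetric 0/1 adjacency matrix and a list giving the region of each node; the function name `count_global_local` and the toy four-node network are hypothetical and are not taken from the figures.

```python
def count_global_local(adjacency, regions):
    """Count global (G) and local (L) links.

    Only the upper triangle of the symmetric 0/1 adjacency matrix is
    visited, so every link is counted exactly once.
    """
    n = len(adjacency)
    total_global = 0
    total_local = 0
    for i in range(n):
        for j in range(i + 1, n):
            if adjacency[i][j]:
                if regions[i] == regions[j]:
                    total_local += 1   # both endpoints in the same region
                else:
                    total_global += 1  # endpoints in different regions
    return total_global, total_local


# Toy network (assumed data): nodes 0 and 1 are in region "A",
# nodes 2 and 3 are in region "B".
regions = ["A", "A", "B", "B"]
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
G, L = count_global_local(adjacency, regions)
print(G, L)
```

Visiting only the entries above the diagonal mirrors the upper triangular summation in the equations and avoids double counting a link from both of its endpoints.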

The global connectivity index for a node i in region r is then given by:

$\begin{matrix}{C_{i{(r)}} = {\left( \frac{\sum\limits_{s \neq r}^{m}\; {\sum\limits_{k{(s)}}\; g_{{i{(r)}}{k{(s)}}}}}{1 + {\sum\limits_{{j{(r)}},{j \neq 1}}\; q_{{i{(r)}}{j{(r)}}}}} \right) \times \left( {G + L} \right)}} & (3)\end{matrix}$

Note that the numeral 1 in the denominator indicates a self-loop of a node.

Using the Jacksonville example above, the equation would be evaluated with G=1 and L=4, resulting in C_(i(r))=[(1/(1+4))×(1+4)]=1, indicating a relatively low level of criticality in the network. Using the Bank of New York example, the equation would be evaluated with G=4 and L=2, resulting in C_(i(r))=[(4/(1+2))×(4+2)]=8, indicating a relatively high level of criticality in the network.
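The two worked examples can be reproduced with a short sketch. It follows the arithmetic of the examples, in which G and L are the global and local link counts of the node being scored; the function name `connectivity_index` is an illustrative assumption.

```python
def connectivity_index(global_links, local_links):
    """Global-to-local ratio, with 1 added to the denominator for the
    self-loop, multiplied by the (G + L) weight used in the examples."""
    ratio = global_links / (1 + local_links)
    return ratio * (global_links + local_links)


# Link counts taken from the Jacksonville and Bank of New York examples.
jacksonville = connectivity_index(global_links=1, local_links=4)
bank_of_new_york = connectivity_index(global_links=4, local_links=2)
print(round(jacksonville, 6), round(bank_of_new_york, 6))
```

The added 1 in the denominator also keeps the index defined for a node that has no local links at all.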

When the hierarchies above are set for the city-to-city long haul data network example, the following node criticality ranking is produced:

Top Sixteen Nodes

CMSA                  Region Score
New York              135.7567108
Chicago               120.3182127
San Francisco         111.5303899
Washington            98.90846075
Boston                93.70275229
Dallas                92.40582839
Denver                81.42618849
St. Louis             56.1399932
Cleveland             43.84487073
Louisville            41.33944954
Kansas City           39.37090433
Seattle               34.70472307
Phoenix               34.70472307
Los Angeles           33.95740498
Atlanta               33.68399592

Thus, the most critical nodes in the network, ranked beginning with the most critical node, are: New York, Chicago, San Francisco, Washington, etc.

Distance Hierarchy

FIG. 6 illustrates the method of the distance hierarchy, according to one embodiment of the invention. In step 605, the network data is loaded into the system as one or more nodes.

In step 610, the distances between the nodes are defined and calculated. Distance is defined according to the desire of the user (e.g., Euclidean distance, latency, capacity, flow data). In this example, distance is defined as Euclidean distance.

In step 615, the link between nodes is designated as global or local. The designation can be determined by automating the node criticality-ranking equation with an incremental set of test distances. The test distances are used to calculate the ratio of global to local links, weighted by the total number of links connected for each individual node in the network. In one embodiment, this process is expressed mathematically as follows:

$\begin{matrix}{R = {\left( \frac{{\sum\limits_{j}\; g_{ij}} > D}{{1 + {\sum\limits_{j}\; l_{ij}}} \leq D} \right)\left( {{\sum\limits_{j}\; g_{ij}} + {\sum\limits_{j}\; l_{ij}}} \right)}} & (4)\end{matrix}$

where Σg_(ij) represents the number of links between node i and nodes having a distance greater than a threshold value D; and Σl_(ij) represents the number of links between node i and nodes having a distance less than or equal to the threshold D. Using the Jacksonville example again, the distance between Jacksonville and its five connecting cities would be calculated as follows: Jacksonville-Atlanta=287 miles, Jacksonville-Orlando=127 miles, Jacksonville-Tallahassee=157 miles, Jacksonville-Charlotte=339 miles, and Jacksonville-DC=647 miles. Using a threshold of 300 miles, there would be three local connections (Jacksonville-Atlanta, Jacksonville-Orlando, and Jacksonville-Tallahassee) and two global connections (Jacksonville-Charlotte and Jacksonville-DC). When these numbers are plugged into the equation, the result is R=[(2/(1+3))×(2+3)]=2.5, raising the relative criticality ranking of the city from the regional hierarchy. Distance could also be calculated by other functions, such as the flow between two nodes. If the same example looked at the tonnage of goods shipped between Jacksonville and its connections, the calculation would be: Jacksonville-Atlanta=6000 tons, Jacksonville-Orlando=8000 tons, Jacksonville-Tallahassee=500 tons, Jacksonville-Charlotte=1500 tons, and Jacksonville-DC=250 tons. Using a threshold of 1000 tons as the break between global and local, there would be two local connections (Jacksonville-Tallahassee and Jacksonville-DC) and three global connections (Jacksonville-Atlanta, Jacksonville-Orlando, and Jacksonville-Charlotte). When these numbers are in turn plugged into the equation, the result is R=[(3/(1+2))×(3+2)]=5, raising the relative criticality ranking of the city from the previous definition of distance. The same calculation could be done using many other definitions of local and global to determine other relationships, such as bandwidth capacity between nodes or the number of passengers using an airline route.
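Both Jacksonville calculations can be checked with one small function. This is a sketch under the stated rule that a link is global when its value exceeds the threshold D and local otherwise; the function name `distance_ratio` and the dictionary layout are illustrative assumptions, while the mileage and tonnage figures are those given above.

```python
def distance_ratio(link_values, threshold):
    """Equation (4) for one node: links with a value above the threshold
    are global, links at or below the threshold are local."""
    global_links = sum(1 for value in link_values.values() if value > threshold)
    local_links = len(link_values) - global_links
    return (global_links / (1 + local_links)) * (global_links + local_links)


# Values for Jacksonville's five links, as listed in the text.
miles = {"Atlanta": 287, "Orlando": 127, "Tallahassee": 157,
         "Charlotte": 339, "DC": 647}
tons = {"Atlanta": 6000, "Orlando": 8000, "Tallahassee": 500,
        "Charlotte": 1500, "DC": 250}

r_miles = distance_ratio(miles, threshold=300)
r_tons = distance_ratio(tons, threshold=1000)
print(r_miles, r_tons)
```

Because the threshold test is the only place where the meaning of "distance" enters, the same function serves for mileage, tonnage, latency, or any other numeric link attribute.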

The test distances are then loaded in the equation and the output is graphed for the various test distances. The inflection point of the graphed curve is used as the distance threshold to run the hierarchy. An example of this is illustrated below using the city-to-city data network utilized previously. A series of alternative distances for distance D (e.g., 100 miles, 200 miles) are selected and used to simulate global/local ratios utilizing the city-to-city data network:

where D ∈ {100, 200, 300, . . . , 2700}

The simulations produce the graph presented in FIG. 7, where the x-axis shows the increments of the global/local ratio produced by different values of D, and the y-axis shows the percentage of nodes with a global to local ratio greater than one. FIG. 7 shows a sharp shift at about 300 miles and a second shift at about 700 miles.

To find the exact point of inflection, the rate of change (i.e., derivative) in the global to local ratio is calculated, as illustrated in FIG. 8.

The rate of change illustrated in FIG. 8 clearly points to 300 miles being the primary point of inflection. Under such an assumption, all links of 300 miles or shorter are considered local and all links over 300 miles are considered global.
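The threshold search can be approximated numerically. The sketch below is one possible reading, not the procedure used to produce FIGS. 7 and 8: it assumes the per-node ratio is global/(1 + local), uses the difference between consecutive test distances as a stand-in for the derivative, and runs on made-up link lengths. The names `share_above_one` and `steepest_threshold` are hypothetical.

```python
def share_above_one(node_link_lengths, threshold):
    """Fraction of nodes whose global/(1 + local) ratio exceeds one."""
    above = 0
    for lengths in node_link_lengths.values():
        n_global = sum(1 for d in lengths if d > threshold)
        n_local = len(lengths) - n_global
        if n_global / (1 + n_local) > 1:
            above += 1
    return above / len(node_link_lengths)


def steepest_threshold(node_link_lengths, thresholds):
    """Return the test distance at which the share changes most relative
    to the previous test distance, together with all computed shares."""
    shares = [share_above_one(node_link_lengths, d) for d in thresholds]
    changes = [abs(b - a) for a, b in zip(shares, shares[1:])]
    steepest = max(range(len(changes)), key=changes.__getitem__)
    return thresholds[steepest + 1], shares


# Made-up link lengths in miles per node, for illustration only.
toy = {
    "a": [250, 280, 290],
    "b": [260, 290, 900, 950],
    "c": [150, 240, 270],
}
best, shares = steepest_threshold(toy, [100, 200, 300, 400])
print(best, shares)
```

On real data the curve may have several steep segments, as the two shifts noted for FIG. 7 suggest, so inspecting the full list of shares is more informative than the single returned value.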

In step 620, the hierarchy of step 615 is utilized for each node in the network to produce a criticality ranking, which ranks each node according to its global/local ratio. A sample of the output for the hierarchy is displayed below.

Top Sixteen Nodes

CMSA                  Global/Local Ratio
Salt Lake City        342
Denver                312
San Francisco         159
Dallas                94
Seattle               79
Chicago               71
Los Angeles           65
Atlanta               64
Washington            62
New York              59
Phoenix               55
Houston               48
Miami                 41
Boston                41
Kansas City           34

Global Hierarchy

FIG. 9 illustrates the method of the global hierarchy, according to one embodiment of the invention. This hierarchy is based on the number of global connections per node. The nodes are ranked based only on this count. In step 905, the network data is loaded into the system as one or more nodes.

In step 910, the distances between each node are defined and calculated. Distance is defined according to the desire of the user (e.g., Euclidean distance, latency, capacity, flow data). In this example, distance is defined as Euclidean distance.

In step 915, the links are ranked according to the following equation:

$\begin{matrix}{R_{L} = {{\sum\limits_{j}\; g_{ij}} > D}} & (5)\end{matrix}$

where R_(L)=the ranking of the link, g_(ij) is the distance between nodes i and j, and D is a threshold distance.

This ranking provides an indicator of how many long haul global connections a node has, dictated by connections longer than D (e.g., D was 300 miles in the sample case presented in step 615). In the Jacksonville example, there were two global links in the distance hierarchy example, thus R_(L)=2, or, using the regional hierarchy's definition of global, R_(L)=1. In the financial example R_(L)=4, and in the distance tonnage example R_(L)=3.

In step 920, the nodes are ranked based on the ranking of the linksconnected to each node.
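Steps 915 and 920 amount to counting the links above the threshold and sorting on that count. A minimal sketch follows; the function name `rank_by_global_links` is hypothetical, the Jacksonville distances are those used earlier, and the other two nodes carry made-up values so that the ordering is visible.

```python
def rank_by_global_links(node_links, threshold):
    """Count links longer than the threshold for every node, then sort
    the nodes from the highest count to the lowest."""
    counts = {
        node: sum(1 for d in lengths if d > threshold)
        for node, lengths in node_links.items()
    }
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


node_links = {
    "Jacksonville": [287, 127, 157, 339, 647],  # distances from the text
    "Node X": [450, 800, 1200, 90],             # made-up values
    "Node Y": [50, 120],                        # made-up values
}
ranking = rank_by_global_links(node_links, threshold=300)
print(ranking)
```

Python's sort is stable, so nodes with equal counts keep their input order; a tie-breaking rule is not specified in the text and would have to be chosen separately.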

Relay Hierarchy

FIG. 10 illustrates the method of the relay hierarchy, according to one embodiment of the invention. This hierarchy identifies relay nodes and their effect on the survivability of the network. Relay nodes are locations that are neither the ultimate origin nor destination of an interaction across a network. The primary purpose of a relay node is to receive flows in order to transmit them to another node with minimum delay and cost. Nodes that act as structural links to relay information to large markets could serve as critical junctures. The following method determines which nodes are disproportionately acting as relay nodes.

In step 1005, the network data is loaded into the system as one or more nodes. In step 1010, the total capacity and demand for each node in the network are determined. For the city-to-city long haul data network example, the total capacity and demand are the total amount of bandwidth connected to the node (i.e., city) and the total bandwidth demand for the node (i.e., city).

In step 1015, the ratio of capacity to demand is determined for each node in the network. Mathematically, this can be expressed as follows:

$\begin{matrix}{R = \frac{\sum\limits_{i = 1}^{n}\; c_{ij}}{\sum\limits_{i = 1}^{n}\; b_{ij}}} & (6)\end{matrix}$

where R=ratio of capacity to demand, c_(ij)=capacity, and b_(ij)=business demand.

The relay hierarchy could be another means used to assess Jacksonville's criticality. Jacksonville's total connected capacity equals 15,000 megabytes, but its demand for capacity is only 5,000 megabytes; thus its relay ratio would be R=(15,000/5,000)=3. The same could be done with an airline network, where capacity is the total number of passengers landing at the airport and demand is the number of passengers for whom Jacksonville is the destination.

In step 1020, the nodes in the network are ranked based on their ratio R of capacity to demand. The greater the ratio, the higher the rank. This approach provides a rough indicator of how much built capacity exceeds the consumption of capacity dictated by demand. A sample of the output for the hierarchy of step 1015 is displayed below.

Top Sixteen Nodes

MSA                   Relay Ratio
Kansas City           7.511627907
Salt Lake City        3.395759717
Indianapolis          3.208191126
Seattle               2.962616822
Portland              2.753665689
Sacramento            2.679577465
St. Louis             2.2382134
Denver                1.951584507
Atlanta               1.882087099
Washington-Baltimore  1.795747423
Chicago               1.712831503
Philadelphia          1.695364238
Orlando               1.485314685
Jacksonville          1.45785877
Phoenix               1.201257862
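Steps 1015 and 1020 can be sketched as a division followed by a sort. The example assumes the per-link sums of equation (6) have already been reduced to one capacity total and one demand total per node; the function name `rank_by_relay_ratio` is hypothetical, the Jacksonville totals come from the worked example, and the second node is made up.

```python
def rank_by_relay_ratio(nodes):
    """nodes maps a name to (total capacity, total demand).
    Returns (name, ratio) pairs sorted from highest ratio to lowest."""
    ratios = {
        name: capacity / demand
        for name, (capacity, demand) in nodes.items()
    }
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


nodes = {
    "Node X": (8000, 8000),         # made-up totals
    "Jacksonville": (15000, 5000),  # totals from the worked example
}
relay_ranking = rank_by_relay_ratio(nodes)
print(relay_ranking)
```

A node with zero recorded demand would raise a division error in this sketch; how such nodes should be treated is not addressed in the text.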

Testing Node Criticality Ranking Hierarchies

The above hierarchies may be compared to determine which hierarchies are most correct. In order to test the effect of the above hierarchies on a network, each hierarchy is subjected to simulations.

Accessibility Index. The most commonly used indicator of node criticality is the number of connections a node has, often called the degree or the accessibility index. To provide a comparison to the new hierarchies outlined in this application, the accessibility index will be calculated and plotted to provide a baseline. This allows a demonstration of whether the new hierarchies are doing better or worse than current methods when the hierarchies are tested in the following section.

FIG. 11 illustrates the method of testing the effectiveness of the node criticality ranking hierarchies, according to one embodiment of the invention. In step 1105, the network data is loaded into the system as one or more nodes. In step 1110, the criticality rankings produced by each hierarchy are loaded into the system.

In step 1115, the diameter and S-I index of each node in each hierarchy are measured. Each node is successively removed according to its rank, and the diameter of the network and the S-I index are measured after each removal.

The diameter of the network is the minimum number of hops it takes to get between the two furthest nodes on the network. Mathematically this is expressed as:

Diameter=maximum D_(ij)

where D_(ij)=shortest path (in links) between the ith and jth node.

Thus, for example, the longest shortest path in the city-to-city network is Eugene, Oreg. to Ft. Myers, Fla., which uses the following route: Eugene, Oreg. to Portland, Oreg. to Seattle, Wash. to Denver, Colo. to St. Louis, Mo. to Atlanta, Ga. to Orlando, Fla. to Tampa, Fla. to Ft. Myers, Fla. The longest shortest path has seven hops, and thus the diameter of the network is seven.
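The definition Diameter=maximum D_(ij) can be computed with a breadth-first search from every node. The sketch below is illustrative only: it assumes an unweighted, undirected network stored as a dictionary of neighbor lists, and the four-node chain is a toy example rather than the city-to-city data. The names `hop_distances` and `diameter` are hypothetical.

```python
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(graph):
    """Longest shortest path, in hops, over all reachable node pairs."""
    return max(max(hop_distances(graph, s).values()) for s in graph)


# Toy chain a - b - c - d (assumed data).
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(diameter(chain))
```

In a disconnected network this function only considers pairs that can reach each other, so it reports the largest diameter found among the components.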

The S-I index of a graph is based on the frequency distribution of the shortest path lengths s_(ij) in the graph. Mathematically, it is defined as the pair (S,I), where:

$\begin{matrix}{S = {{\frac{\mu_{3}}{\mu_{2}}\mspace{14mu} {and}\mspace{14mu} I} = \frac{\mu_{2}}{\mu_{1}}}} & (7)\end{matrix}$

In the above equation, μ₁ is the first moment (i.e., mean) of the frequency of all shortest paths in the network, μ₂ is the second moment (i.e., variance) of the frequency of all shortest paths in the network, and μ₃ is the third moment (i.e., skewness) of the frequency of all shortest paths in the network. Once each moment for the network has been calculated, the S index is calculated by dividing the third moment by the second moment, and the I index is calculated by dividing the second moment by the first moment. For example, in the city-to-city data network μ₁=2.8274, μ₂=0.8324, and μ₃=0.0444. Thus S=0.0534 and I=0.2944, both providing a measure of connectivity for the network. As nodes are removed from the network, the connectivity decreases, and the S and I indices capture the loss quantitatively.
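The S and I ratios can be computed directly from a list of shortest path lengths. The sketch below assumes that μ₁ is the arithmetic mean and that μ₂ and μ₃ are the second and third moments taken about that mean; the text does not state this explicitly, so it is an assumption, as are the function name `s_i_index` and the four sample path lengths.

```python
def s_i_index(path_lengths):
    """Return (S, I) where S = mu3 / mu2 and I = mu2 / mu1.

    mu1 is the mean; mu2 and mu3 are central moments (an assumption).
    """
    n = len(path_lengths)
    mu1 = sum(path_lengths) / n
    mu2 = sum((x - mu1) ** 2 for x in path_lengths) / n
    mu3 = sum((x - mu1) ** 3 for x in path_lengths) / n
    return mu3 / mu2, mu2 / mu1


# Made-up shortest path lengths (in hops) for illustration.
S, I = s_i_index([1, 1, 2, 3])
print(round(S, 4), round(I, 4))
```

If every shortest path had the same length, μ₂ would be zero and S undefined, so a caller may want to guard against that case.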

By examining the S-I index of the US IP network infrastructure as nodes are removed, one can obtain a quantitative indication of how disconnected the network becomes.

The results of both the diameter and S-I index analysis can be found in the example below.

Output of Diameter and S-I Index Analysis on Hierarchies

Binary Hierarchy

Diameter  CMSA                  I = u2/u1   S = u3/u2
7                               0.2937      0.0499
8         Atlanta               0.3416      0.0927
8         Chicago               0.3445      0.0449
8         San Francisco         0.3466      0.0424
10        Dallas                0.4415      0.4056
10        Washington            0.4441      0.3019
10        New York              0.4463      0.3133
10        Denver                0.4602      0.3656
10        Houston               0.5313      0.4742
10        Kansas City           0.5410      0.3871
10        Los Angeles           0.5085      0.2671
10        Cleveland             0.5037      0.2268
10        St. Louis             0.5096      0.1999
10        Salt Lake City        0.5069      0.1805
10        Boston 2              0.5145      0.1185
10        Phoenix               0.5374      0.1309

Regional Hierarchy

Diameter  CMSA                  I = u2/u1   S = u3/u2
7                               0.2937      0.0499
8         New York              0.3029      0.0454
8         Chicago               0.3063      −0.0125
8         San Francisco         0.3155      −0.0468
8         Washington            0.3318      −0.0793
9         Boston                0.3938      0.2081
10        Dallas                0.4804      0.4802
10        Denver                0.4962      0.5025
10        St. Louis             0.4982      0.4890
11        Cleveland             0.5915      0.6812
11        Louisville            0.5933      0.6759
11        Kansas City           0.6600      0.5959
12        Seattle               0.7778      0.9118
12        Phoenix               0.7752      0.8822
12        Los Angeles           0.7622      0.8810
12        Atlanta               0.7656      0.4362

Distance Hierarchy

Diameter  CMSA                  I = u2/u1   S = u3/u2
7                               0.2937      0.0499
8         Salt Lake City        0.2935      0.0399
8         Denver                0.3003      0.0573
8         San Francisco         0.3061      0.0246
9         Dallas                0.4081      0.5258
9         Seattle               0.4072      0.5149
9         Chicago               0.4194      0.4465
9         Los Angeles           0.3841      0.2800
10        Atlanta               0.4205      0.1839
10        Washington            0.4420      0.0249
10        New York              0.4394      −0.1134
10        Phoenix               0.4583      −0.0784
11        Houston               0.5412      0.1520
13        Miami                 0.7341      0.6719
14        Boston                0.9572      0.8135
16        Kansas City           1.3219      1.1954

Global Hierarchy

Diameter  CMSA                  I = u2/u1   S = u3/u2
7                               0.2937      0.0499
8         San Francisco         0.2981      0.0258
8         Atlanta               0.3489      0.0779
8         Chicago               0.3518      0.0169
10        Dallas                0.4384      0.3208
10        Denver                0.4503      0.3691
10        Washington            0.4717      0.1825
10        New York              0.4672      0.0570
10        Salt Lake City        0.4649      0.0189
10        Los Angeles           0.4427      −0.0806
10        Houston               0.4932      −0.0264
11        Kansas City           0.5306      −0.0190
11        Seattle               0.5317      −0.0705
12        Phoenix               0.6464      0.2665
13        Boston                0.8097      0.3425
16        Miami                 1.3219      1.1954

Relay Node Hierarchy

Diameter  MSA                   I = u2/u1   S = u3/u2
7                               0.2937      0.0499
8         Kansas City           0.2958      0.0405
8         Salt Lake City        0.2956      0.0302
8         Indianapolis          0.2949      0.0227
8         Seattle               0.2942      0.0137
10        Portland              0.3654      0.6527
10        Sacramento            0.3834      0.7821
          St. Louis             0.3866      0.7927
10        Denver                0.4063      0.7470
10        Atlanta               0.4248      0.5493
10        Washington-Baltimore  0.4254      0.4537
10        Chicago               0.4285      0.3020
10        Philadelphia          0.4291      0.2970
10        Orlando               0.4412      0.2912
12        Jacksonville          0.5249      0.6122
12        Phoenix               0.5237      0.6021

The diameter results are the easiest to interpret and reveal some interesting findings. The hierarchies with the largest effect on the diameter of the network were the distance hierarchy and the global hierarchy, both of which ended in a diameter of 16 when the top 15 nodes (roughly 10%) were removed. The superior performance of the distance hierarchy confirmed that the best performing hierarchy would be one based on Euclidean distance. The global hierarchy was based on the presence of a large number of long distance links between two different regions. While it did not directly use Euclidean distance, there is an obvious correlation between global links between different regions and a longer physical length.

The starting diameter of the network in the case of both the distance and global hierarchy was 7, and the end result of 16 was more than a doubling of the diameter. Thus, it took more than twice the number of hops to connect the two furthest places on the network. This results in a ripple effect across the network where it will take a minimum of twice the time to get from any point to another. This does not take into account the capacity of the links removed and how traffic will be redistributed across the network. While both hierarchies end up at 16, the global hierarchy increases the diameter more rapidly in the beginning, while the distance hierarchy increases the diameter more quickly at the end of the nodal hierarchy. The next group of nodal hierarchies was the relay node and regional hierarchies, which both end with a diameter of 12. Finally, the binary and bandwidth capacity hierarchy had the least impact, each ending in a diameter of 10.

In step 1120, the results of step 1115 are plotted in a graph form. The graph format allows a visual indication of which node ranking hierarchy does a better job of identifying critical nodes in a network. The graph format also gives an indication of when the network experiences a catastrophic failure, breaking apart into disconnected components. The diameter relationship of the hierarchies is seen more clearly when all the nodal hierarchies are plotted with their diameters at each successive node removal, as illustrated in FIG. 12.

The graph illustrates two aspects of network resiliency: the diameter of the network, and the point at which the network Balkanizes, indicative of a catastrophic failure. The diameter of the network after each successive node removal is indicated by the number on the x axis. As the diameter increases, it takes more hops to connect nodes in the network, indicating a decrease in efficiency and an increase in latency. Balkanization is indicated at the point that the diameter of the network stops increasing and drops off rapidly. At this point, the network has broken into two or more segments and the hierarchy takes the diameter of the largest remaining subgraph. Since the network has segmented into smaller parts, the diameter decreases to match the network's now smaller size. Since the network has now fractured into segments that can no longer communicate with each other, a catastrophic failure has occurred. When the hierarchies were compared using the above indicators, all the hierarchies outperformed the existing standard, the accessibility index. The global hierarchy reached the highest diameter, followed by the distance hierarchy and the regional hierarchy. While the global hierarchy reached the highest diameter, the distance hierarchy causes catastrophic Balkanization in the network first, closely followed by the global hierarchy and then the regional hierarchy. An examination of the S-I index confirms the findings of the diameter analysis.
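The removal test can be simulated on a small network to show the rise and subsequent drop in diameter. The following is a sketch only, assuming an undirected network stored as a dictionary of neighbor lists and a six-node ring as toy data; the helper names (`largest_component_diameter`, `remove_node`, `removal_trace`) are hypothetical.

```python
from collections import deque


def _hops(graph, source):
    """Breadth-first hop counts from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def largest_component_diameter(graph):
    """Diameter of the largest connected component (0 for an empty graph)."""
    seen, largest = set(), []
    for start in graph:
        if start not in seen:
            component = list(_hops(graph, start))
            seen.update(component)
            if len(component) > len(largest):
                largest = component
    return max((max(_hops(graph, s).values()) for s in largest), default=0)


def remove_node(graph, victim):
    """Copy of the graph without the victim node or its links."""
    return {n: [m for m in nbrs if m != victim]
            for n, nbrs in graph.items() if n != victim}


def removal_trace(graph, ranking):
    """Diameter before any removal and after each ranked removal."""
    trace = [largest_component_diameter(graph)]
    for node in ranking:
        graph = remove_node(graph, node)
        trace.append(largest_component_diameter(graph))
    return trace


# Toy ring a - b - c - d - e - f - a (assumed data).
ring = {"a": ["b", "f"], "b": ["a", "c"], "c": ["b", "d"],
        "d": ["c", "e"], "e": ["d", "f"], "f": ["e", "a"]}
trace = removal_trace(ring, ["a", "d"])
print(trace)
```

In this toy case the first removal stretches the ring into a chain and the diameter rises; the second removal splits the chain into two pieces and the diameter of the largest piece falls, which is the drop described above as Balkanization.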

FIG. 13 illustrates the S and I measure of the network as nodes areremoved from the network, using the global hierarchy approach. The graphformat clearly shows the similar effect S and I have with diameter asnodes are removed and the extreme sensitivity of S to network changes.The graphical approach is different from the typical plotting of the Sand I onto the S-I plane as (X,Y) coordinates, but works well in thiscase to one demonstrate the connection between diameter and the S-Imeasures, and two show how increases in the S-I index are indicators ofa disconnecting network.

Using Nodes to Define Regions

In the examples outlined above, a variety of hierarchies are used to determine which nodes in a network are most critical. Many times, the most critical nodes in a network are already known and may not involve connectivity or the measures outlined above. In this case it is useful to know what regions are impacted by these critical nodes. As before, the regions defined by the hierarchy can be geographic (e.g., a critical hub located in Atlanta) or non-geographic (e.g., a market or industrial sector).

FIG. 14 illustrates the method of defining regions by node connectivity, according to one embodiment of the invention. In step 1405, the network data is loaded into the system as one or more nodes. In step 1410, for a network N of nodes n, an adjacency matrix A and a distance matrix W are generated, based on the connectivity of the loaded data. Adjacency matrix A is the connectivity matrix of the network being analyzed. In the city-to-city data network, this would be cities and the connections between them. If there were a connection between Jacksonville and Atlanta, a one would be entered in matrix A. If there were no connection, then a zero would be entered.

The distance matrix W indicates the distance between any two directly connected nodes in matrix A. In the case of the city-to-city data network, it is the number of miles between any two directly connected cities. For example, for a connection between Jacksonville and Atlanta, the distance matrix would have a value of 281 miles in the cell of the matrix representing the connection between Jacksonville and Atlanta. The members of matrix W represent the distance (e.g., physical distance, latency, or any other appropriate variable) between any two nodes of N.
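As a minimal sketch of step 1410, the two matrices can be built from a list of links. The function name build_matrices and the list-of-lists layout are assumptions made for illustration. The Jacksonville to Atlanta mileage comes from the example above, while the Atlanta to Washington link and its mileage are hypothetical.

```python
def build_matrices(cities, links):
    """Return (index, A, W) for an undirected network.

    A[i][j] is 1 when cities i and j are directly connected, else 0.
    W[i][j] holds the distance of that direct connection, else 0.
    """
    index = {city: i for i, city in enumerate(cities)}
    n = len(cities)
    A = [[0] * n for _ in range(n)]
    W = [[0] * n for _ in range(n)]
    for a, b, miles in links:
        i, j = index[a], index[b]
        A[i][j] = A[j][i] = 1      # symmetric: the link runs both ways
        W[i][j] = W[j][i] = miles
    return index, A, W


cities = ["Jacksonville", "Atlanta", "Washington"]
links = [
    ("Jacksonville", "Atlanta", 281),  # mileage taken from the text
    ("Atlanta", "Washington", 540),    # hypothetical link and mileage
]
index, A, W = build_matrices(cities, links)
jax, atl, was = (index[c] for c in cities)
print(A[jax][atl], W[jax][atl], A[jax][was])
```

Any other weight (latency, capacity, and so on) could be stored in W in place of miles without changing the structure.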

In step 1415, the shortest path for each node in N is computed using adjacency matrix A. This is done by calculating the smallest number of hops needed to connect a single node individually with every other node in the network. This process is repeated for every node in the network, thus providing the shortest paths for each node in the network N.
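One way to carry out step 1415 is a breadth-first search from every node, sketched below. The text does not name a specific shortest-path algorithm, so breadth-first search is an assumption here, as are the function name hop_matrix and the four-node chain used as input.

```python
from collections import deque


def hop_matrix(A):
    """Return H where H[s][t] is the fewest hops from s to t.

    A is a 0/1 adjacency matrix. Unreachable pairs are left as None.
    """
    n = len(A)
    H = [[None] * n for _ in range(n)]
    for s in range(n):
        H[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if A[u][v] and H[s][v] is None:
                    H[s][v] = H[s][u] + 1
                    queue.append(v)
    return H


# Hypothetical chain of four nodes: 0-1-2-3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
H = hop_matrix(A)
print(H[0])
```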

In step 1420, the number of connections for each node in N is determined, and the nodes are ranked in descending order. Thus, assuming that the adjacency matrix A is symmetric, either egress or ingress connections c(i) are computed for each node i of N. These nodes are then ranked in descending order by ingress (egress) connections c.

In step 1425, a set m<n of an arbitrary number of top ranked nodes (such as, but not limited to, the nodes ranked in step 1420) is created. Thus, for example, the set of nodes could be n={New York, Washington, San Francisco, Seattle, Atlanta}, and the set m of top ranked nodes could be m={New York, Washington, San Francisco}. Selecting the number of hubs in the network is left to the user's discretion. The user can use one of the ranking hierarchies outlined above, or the user's own qualitative measures based on insider knowledge of a network. Thus, the number depends on the demands of the user and on which nodes in the network the user determines to be critical.
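Steps 1420 and 1425 can be sketched as a connection count followed by a cut at a user-chosen size. The five city names come from the example above, but the links between them are hypothetical, as are the function names and the alphabetical tie-break, which the text does not specify.

```python
def connection_counts(names, links):
    """Count connections c(i) per node for an undirected link list."""
    counts = {name: 0 for name in names}
    for a, b in links:
        counts[a] += 1
        counts[b] += 1
    return counts


def rank_nodes(counts):
    """Rank nodes by descending connection count.

    Ties are broken alphabetically; this is an assumption made only
    to keep the ordering deterministic.
    """
    return sorted(counts, key=lambda name: (-counts[name], name))


names = ["New York", "Washington", "San Francisco", "Seattle", "Atlanta"]
links = [  # hypothetical connectivity
    ("New York", "Washington"),
    ("New York", "San Francisco"),
    ("New York", "Atlanta"),
    ("New York", "Seattle"),
    ("Washington", "Atlanta"),
    ("Washington", "San Francisco"),
    ("San Francisco", "Seattle"),
]
counts = connection_counts(names, links)
ranked = rank_nodes(counts)
hubs = ranked[:3]  # the size of set m is left to the user
print(hubs)
```

A user who already knows the critical nodes could skip the ranking and supply the hub list directly.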

In step 1430, for each member in the m set of nodes (e.g., hubs), a list of the nodes that are one hop, two hops, three hops, etc. away from that member is generated. Thus, for node j in the set m, lists L_(r)(j) (e.g., of nodes that are 1, 2, . . . s hops distant from node j), with r∈[1, 2, . . . s], are created.
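The hop lists of step 1430 can be sketched by grouping the results of a breadth-first search by hop count. The function name hop_lists is an assumption, and Savannah and Greenville are hypothetical intermediate cities added so that Jacksonville and Charlotte sit two hops from Atlanta.

```python
from collections import deque


def hop_lists(adj, hub):
    """Return {r: sorted nodes exactly r hops from hub} for r >= 1."""
    hops = {hub: 0}
    queue = deque([hub])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    lists = {}
    for node, r in hops.items():
        if r > 0:  # the hub itself is not a member of any list
            lists.setdefault(r, []).append(node)
    return {r: sorted(nodes) for r, nodes in sorted(lists.items())}


adj = {  # hypothetical connectivity around one hub
    "Atlanta": ["Savannah", "Greenville"],
    "Savannah": ["Atlanta", "Jacksonville"],
    "Greenville": ["Atlanta", "Charlotte"],
    "Jacksonville": ["Savannah"],
    "Charlotte": ["Greenville"],
}
lists_for_atlanta = hop_lists(adj, "Atlanta")
print(lists_for_atlanta)
```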

In step 1435, each node in the network follows its available shortest path until a node j in the m set of nodes is reached. This can be calculated by setting

$R_{j} = \sum_{r} L_{r}(j),$

where R_(j) represents a region around node j, which is included in the set m (i.e., j∈m). In the city-to-city example, Atlanta, Ga., Washington, D.C., St. Louis, Mo., and San Francisco, Calif. all contain critical data warehouses designated as critical hubs by a firm. The region impacted by the loss of a data warehouse could then be ascertained using this hierarchy, by determining which nodes fall under a particular data warehouse's region of connectivity. When Jacksonville's shortest path is calculated to all the hubs in the list, it is two hops from Atlanta, three hops from Washington, four hops from St. Louis, and six hops from San Francisco. Thus the hierarchy would place Jacksonville as belonging to Atlanta's region.

Starting with the highest ranked node of set m, the list of nodes that are r hops away from node j (i.e., L_(r)(j)) is compared to the list of nodes that are r hops away from node k (i.e., L_(r)(k)), where k is a different one of the highest ranking nodes included in the set m (i.e., k≠j). One-hop connections (if there are any) between the top nodes in the set of m nodes are not included.

In step 1440, if there are two or more nodes in the set reachable by equal shortest paths, this tie is broken by determining which node is more proximate. Proximity can be defined by distance, capacity, latency, or any other appropriate metric. Thus, if there is a common node q that is r hops away from both j and k, then the physical distances d_(jq) and d_(kq) between nodes j and q and between nodes k and q, taken from the distance matrix W, are compared. If d_(jq)<=d_(kq), then node q belongs to the list L_(r)(j) or region R_(j), whose members are exactly r hops away from node j∈m. If d_(jq)>d_(kq), then q belongs to the list L_(r)(k) or region R_(k), whose members are exactly r hops away from node k∈m.

Building on the data warehouse example, suppose Charlotte is to be assigned to a region and is two hops from Washington, two hops from Atlanta, three hops from St. Louis, and five hops from San Francisco. Because there is a tie between Washington and Atlanta, the tiebreaker would be based on which city is closer to Charlotte. In the case of Euclidean distance, matrix W would be referenced, and the lower value would be selected (i.e., Washington is 350 miles from Charlotte, but Atlanta is only 200 miles from Charlotte). Thus Charlotte would be placed in Atlanta's region. As with the distance hierarchy, different values can be used to indicate distance between two nodes (e.g., capacity, flow, etc.).
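The assignment rule of steps 1435 and 1440 can be sketched using the hop counts and mileages given for Jacksonville and Charlotte. The function name assign_region and the dictionary inputs are assumptions. Hubs are assumed to be listed in rank order, so that an exact tie in distance goes to the higher ranked hub, in keeping with the d_(jq)<=d_(kq) rule.

```python
def assign_region(hops_to_hub, miles_to_hub):
    """Return (hub, hops) for one node.

    hops_to_hub maps each hub to its hop count from the node, listed
    in hub rank order. miles_to_hub is consulted only on a hop tie.
    """
    fewest = min(hops_to_hub.values())
    tied = [hub for hub, h in hops_to_hub.items() if h == fewest]
    if len(tied) == 1:
        return tied[0], fewest
    # min() keeps the first of equal values, so an exact distance tie
    # goes to the hub that appears earlier (higher rank) in the list.
    closest = min(tied, key=lambda hub: miles_to_hub[hub])
    return closest, fewest


jacksonville = assign_region(
    {"Atlanta": 2, "Washington": 3, "St. Louis": 4, "San Francisco": 6},
    {},  # no tie, so no distances are needed
)
charlotte = assign_region(
    {"Washington": 2, "Atlanta": 2, "St. Louis": 3, "San Francisco": 5},
    {"Washington": 350, "Atlanta": 200},
)
print(jacksonville, charlotte)
```

Swapping miles for latency or capacity only changes the values passed as the second argument.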

In step 1445, each node is placed in a set under its designated hub and attached to an attribute indicating how many hops the node is from its designated hub. In the data-warehousing example, both Charlotte and Jacksonville would have a two attributed to them, because they are both two hops away from Atlanta. Each of these lists comprises a region that can be mapped, as illustrated in FIG. 15.

In this example, nodes that are one hop from the regional hub are given the hub's abbreviated name (e.g., ATL=Atlanta), and cities that are more than one hop away are designated by the abbreviated name followed by the number of hops (e.g., ATL2=two hops away from Atlanta). It should be noted that the distance variable could be substituted with a bandwidth capacity variable, or other variable of the user's choice, as best fits the hierarchy's application. In this case, distance was used because network design most often incorporates a distance cost variable when selecting link build outs.
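The labeling convention just described can be sketched as follows. The function names and the Savannah entry are hypothetical, and only the ATL abbreviation comes from the text. Giving the hub itself (zero hops) the bare abbreviation is an assumption, since the text does not say how the hub is labeled.

```python
def region_label(abbreviation, hops):
    """Label for a node: 'ATL' at one hop, 'ATL2' at two hops, etc."""
    # Assumption: the hub itself (zero hops) also gets the bare name.
    return abbreviation if hops <= 1 else f"{abbreviation}{hops}"


def label_nodes(assignments, abbreviations):
    """Map each node to its region label.

    assignments maps node -> (hub, hops), as produced in step 1445.
    abbreviations maps hub -> short name.
    """
    return {
        node: region_label(abbreviations[hub], hops)
        for node, (hub, hops) in assignments.items()
    }


abbreviations = {"Atlanta": "ATL"}
assignments = {
    "Charlotte": ("Atlanta", 2),
    "Jacksonville": ("Atlanta", 2),
    "Savannah": ("Atlanta", 1),  # hypothetical one-hop neighbor
}
labels = label_nodes(assignments, abbreviations)
print(labels)
```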

CONCLUSION

While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant art(s) that various changes in form and detail can be made therein without departing from the spirit and scope of the present invention. Thus, the present invention should not be limited by any of the above-described exemplary embodiments.

In addition, it should be understood that the Figures described above, which highlight the functionality and advantages of the present invention, are presented for example purposes only. The architecture of the present invention is sufficiently flexible and configurable, such that it may be utilized in ways other than that shown in the Figures.

Further, the purpose of the Abstract is to enable the U.S. Patent and Trademark Office and the public generally, and especially the scientists, engineers and practitioners in the art who are not familiar with patent or legal terms or phraseology, to determine quickly from a cursory inspection the nature and essence of the technical disclosure of the application. The Abstract is not intended to be limiting as to the scope of the present invention in any way.

1. A method for analyzing a structure of a network, comprising: entering data from the network into a system as one or more nodes; designating links between nodes as global or local; and ranking the nodes utilizing link designation as global or local.